168 research outputs found

    Selective Regression Testing based on Big Data: Comparing Feature Extraction Techniques

    Regression testing is a necessary activity in continuous integration (CI) since it provides confidence that modified parts of the system are correct at each integration cycle. CI provides large volumes of data which can be used to support regression testing activities. By using machine learning, patterns about faulty changes in the modified program can be induced, allowing test orchestrators to make inferences about test cases that need to be executed at each CI cycle. However, one challenge in using learning models lies in finding a suitable way for characterizing source code changes and preserving important information. In this paper, we empirically evaluate the effect of three feature extraction algorithms on the performance of an existing ML-based selective regression testing technique. We designed and performed an experiment to empirically investigate the effect of Bag of Words (BoW), Word Embeddings (WE), and content-based feature extraction (CBF). We used stratified cross validation on the space of features generated by the three FE techniques and evaluated the performance of three machine learning models using the precision and recall metrics. The results from this experiment showed a significant difference between the models' precision and recall scores, suggesting that the BoW-fed model outperforms the other two models with respect to precision, whereas a CBF-fed model outperforms the rest with respect to recall.
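As a rough illustration of the Bag of Words idea named in this abstract, the sketch below turns changed source lines into token-count vectors. It is not the authors' pipeline: the tokenizer regex, the helper names (`tokenize`, `build_vocabulary`, `bow_vector`) and the two sample lines are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(line):
    """Split a source line into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", line)


def build_vocabulary(lines):
    """Map every distinct token to a column index (tokens sorted for stability)."""
    tokens = sorted({tok for line in lines for tok in tokenize(line)})
    return {tok: index for index, tok in enumerate(tokens)}


def bow_vector(line, vocab):
    """Count how often each vocabulary token occurs in the line."""
    counts = Counter(tokenize(line))
    return [counts.get(tok, 0) for tok in vocab]


# Hypothetical changed lines from a commit (illustrative only).
changed_lines = ["int x = x + 1;", "return x;"]
vocab = build_vocabulary(changed_lines)
vectors = [bow_vector(line, vocab) for line in changed_lines]

for line, vector in zip(changed_lines, vectors):
    print(line, "->", vector)
```

Each vector could then be fed to a classifier as one feature row. Word order is discarded, which is the main trade-off of this representation compared with embeddings.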

    Predicting build outcomes in continuous integration using textual analysis of source code commits

    Machine learning has been increasingly used to solve various software engineering tasks. One example of its usage is to predict the outcome of builds in continuous integration, where a classifier is built to predict whether new code commits will successfully compile. The aim of this study is to investigate the effectiveness of fifteen software metrics in building a classifier for build outcome prediction. In particular, we implemented an experiment wherein we compared the effectiveness of a line-level metric and fourteen other traditional software metrics on 49,040 build records that belong to 117 Java projects. We achieved an average precision of 91% and recall of 80% when using the line-level metric for training, compared to 90% precision and 76% recall for the next best traditional software metric. In contrast, using file-level metrics was found to yield a higher predictive quality (average MCC for the best software metric = 68%) than the line-level metric (average MCC = 16%) for the failed builds. We conclude that file-level metrics are better predictors of build outcomes for the failed builds, whereas the line-level metric is a slightly better predictor of passed builds.
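The abstract above reports precision, recall, and the Matthews correlation coefficient (MCC) side by side. The sketch below computes all three from a confusion matrix, using the standard definitions. The ten build labels are made-up toy data, not figures from the study, and the helper names are assumptions for this example.

```python
import math


def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Toy data (hypothetical): 8 passed builds, 2 failed builds, one failure missed.
y_true = ["passed"] * 8 + ["failed"] * 2
y_pred = ["passed"] * 8 + ["passed", "failed"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="passed")
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
print("MCC:", mcc(tp, fp, fn, tn))
```

On imbalanced data like this, precision and recall for the majority class can stay high while MCC is noticeably lower, because MCC uses all four cells of the confusion matrix, including the minority-class errors.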

    Defect Inflow Prediction in Large Software Projects


    Comparing autoencoder-based approaches for anomaly detection in highway driving scenario images

    Autoencoder-based anomaly detection approaches can be used for precluding scope compliance failures of the automotive perception. However, the applicability of these approaches for the automotive domain should be thoroughly investigated. We study the capability of two autoencoder-based approaches using reconstruction errors and bottleneck-values for detecting semantic anomalies in automotive images. As a use-case, we consider a specific highway driving scenario identifying if there are any vehicles in the field of view of a front-looking camera. We conduct a series of experiments with two simulated driving scenario datasets and measure anomaly detection performance for different cases. We systematically test different autoencoders and training parameters, as well as the influence of image colors. We show that the autoencoder-based approaches demonstrate promising results for detecting semantic anomalies in highway driving scenario images in some cases. However, we also observe the variability of anomaly detection performance between different experiments. The autoencoder-based approaches are capable of detecting semantic anomalies in highway driving scenario images to some extent. However, further research with other use-cases and real datasets is needed before they can be safely applied in the automotive domain

    Predicting and Evaluating Software Model Growth in the Automotive Industry

    The size of a software artifact influences the software quality and impacts the development process. In industry, when software size exceeds certain thresholds, memory errors accumulate and development tools might not be able to cope anymore, resulting in lengthy program start-up times, failing builds, or memory problems at unpredictable times. Thus, foreseeing critical growth in software modules meets a high demand in industrial practice. Predicting the time when the size grows to the level where maintenance is needed prevents unexpected efforts and helps to spot problematic artifacts before they become critical. Although the number of prediction approaches in the literature is vast, it is unclear how well they fit with prerequisites and expectations from practice. In this paper, we perform an industrial case study at an automotive manufacturer to explore the applicability and usability of prediction approaches in practice. In a first step, we collect the most relevant prediction approaches from the literature, including both statistical and machine learning approaches. Furthermore, we elicit expectations towards predictions from practitioners using a survey and stakeholder workshops. At the same time, we measure the software size of 48 software artifacts by mining four years of revision history, resulting in 4,547 data points. In the last step, we assess the applicability of state-of-the-art prediction approaches using the collected data by systematically analyzing how well they fulfill the practitioners' expectations. Our main contribution is a comparison of commonly used prediction approaches in a real-world industrial setting while considering stakeholder expectations. We show that the approaches provide significantly different results regarding prediction accuracy and that the statistical approaches fit our data best.

    How to Succeed in Communicating Software Metrics in Organization?

    While software metrics are indispensable for quality assurance, using metrics in practice is complicated. Quality, productivity, speed, and efficiency are important factors to be considered in software development (Holmstrom et al. 2006; Svensson 2005). Measuring correct metrics and using them in the right and transparent way contributes to pushing development in a desirable direction, leading to achieving projected goals and outcomes (Staron and Meding 2018). On the other hand, tracking the wrong metrics, and failing to interpret and communicate them properly results in a stressful work environment, conflicts, distrust, lower engagement, and decreased productivity (de Sá Leitão Júnior 2018; Ellis et al. 1991; Staron 2012). To ensure proper and effective use of metrics in organizations, successful communication around metrics is essential (Lindström et al. 2021; Post et al. 2002; Staron and Meding 2015). The purpose of this study is to understand and improve communication about metrics in contexts of contemporary software development practice in organizations. This is achieved by identifying the bottlenecks in the process of communication around metrics and how to overcome them in practice. Drawing on 38 semi-structured interviews and interactive workshops with metrics teams members and stakeholders from three organizations, we identify three interrelated challenges including limited knowledge about metrics and lack of terminology, uncoordinated use of multiple communication channels, and sensitivity of metrics, which influence workplace communication, trust, and performance. Our study shows the importance of developing metrics terminology to ensure the development of a shared understanding of metrics. 
    Further, it is essential to raise awareness about the affordances of channels commonly used in software organizations, such as dashboards, email, MS Teams meetings/chat, stand-up meetings, and reports, and about how they can be combined to successfully transfer information about metrics (Verhulsdonck and Shah 2020). This becomes especially important in remote work practices. Finally, though metrics are a powerful tool for decision making, enhancing transparency, and steering development in the desired direction, they can also turn into a tool for finger-pointing, blaming, and pressure, resulting in stress and conflicts (Streit and Pizka 2011). The findings also indicate the importance of creating a culture around metrics, and of clarifying and informing about the purpose of metrics in the organization (Umarji and Seaman 2008). We plan to build on the early findings of this study to develop a comprehensive framework for successful software metrics communication within organizations.

    Who are Metrics Team’s Stakeholders and What Do They Expect? Conducting Stakeholder Mapping with Focus on Communication in Agile Software Development Organization

    As an increasing number of organizations create metrics teams, conducting stakeholder mapping is pivotal for identifying and analyzing metrics stakeholders’ expectations for reducing the risks of miscommunication and project failure. Further, though team-stakeholder communication is essential for successful collaboration, few studies focus on it in software measurement context. This case study seeks to identify and analyze metrics team’s stakeholders, with a special focus on communication challenges in team-stakeholder contacts. Inspired by Bryson’s Basic Stakeholder Analysis Techniques and Mitchell, Agle, and Wood’s theoretical model for stakeholder identification, a stakeholder mapping exercise was conducted using interactive workshops and follow-up interviews with 16 metrics team members and their stakeholders. The results illustrate the complexity of identifying stakeholders in agile organizations, the importance of developing a metrics culture, and enhancing transparency in team-stakeholder communication. The study aims to contribute to the development of stakeholder theory and offers insights into communication in software engineering context.

    Predicting Test Case Verdicts Using Textual Analysis of Committed Code Churns

    Background: Continuous Integration (CI) is an agile software development practice that involves producing several clean builds of the software per day. The creation of these builds involves running a large number of automated tests, which is hampered by high hardware cost and reduced development velocity. Goal: The goal of our research is to develop a method that reduces the number of executed test cases at each CI cycle. Method: We adopt a design research approach with an infrastructure provider company to develop a method that exploits Machine Learning (ML) to predict test case verdicts for committed source code. We train five different ML models on two data sets and evaluate their performance using two simple retrieval measures: precision and recall. Results: While the results from training the ML models on the first data set of test executions revealed low performance, the curated data set for training showed an improvement in performance with respect to precision and recall. Conclusion: Our results indicate that the method is applicable when training the ML model on churns of small size.